White Listing and Score Normalization for Keyword Spotting of Noisy Speech
Abstract
We present a method that avoids the problem of a large vocabulary recognition system missing keywords due to pruning errors or degraded speech. The method, called white listing, ensures that all tokens of all of the keywords are found by the recognizer, albeit with a low score. We show that this method far outperforms methods that attempt to increase recall by using subword models. In addition, we introduce a simple score normalization technique based on mapping the decoding score for a keyword to the probability of false alarm for that keyword. This mapping can be estimated reliably for all keywords, even when the training or tuning set contains no examples of those keywords. It makes the scores of all keywords consistent at all ranges, which allows us to use a single consistent score for all keywords. We show that this method reduces the average miss rate by about a factor of 2 for the same false alarm rate. The method can also be used for combining multiple keyword spotting systems.
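The idea of mapping a decoding score to a probability of false alarm can be illustrated with a minimal Python sketch. This is not the estimator from the paper, whose details the abstract does not give. It assumes a hypothetical pooled list of raw scores from detections already known to be false alarms, and it uses the smoothed fraction of those scores at or above a given score as the estimate. The names `make_pfa_mapper` and `false_alarm_scores` are invented for illustration.

```python
from bisect import bisect_left


def make_pfa_mapper(false_alarm_scores):
    """Build a function mapping a raw score to an estimated P(false alarm).

    Assumption: `false_alarm_scores` holds raw decoding scores of detections
    known to be false alarms (hypothetical tuning data, pooled over keywords).
    """
    ranked = sorted(false_alarm_scores)
    n = len(ranked)

    def pfa(score):
        # Count the observed false alarms that scored at least as high.
        at_or_above = n - bisect_left(ranked, score)
        # Add-one smoothing keeps the estimate strictly between 0 and 1.
        return (at_or_above + 1) / (n + 2)

    return pfa


# Usage: a higher raw score maps to a lower estimated false alarm probability.
pfa = make_pfa_mapper([0.1, 0.2, 0.3, 0.4])
print(pfa(0.05), pfa(0.35), pfa(1.0))
```

Because every keyword's score ends up on the same probability scale, one threshold can then be applied to all keywords, which is the property the abstract highlights.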
Related articles
Improving the Performance of a Telephone Keyword Spotting System Using Confidence Score Normalization Based on Linear Programming
Conventional word spotting systems determine hypothesized keywords and their confidence scores using a speech recognizer. Each hypothesized keyword is accepted or rejected by comparing its score with a specific threshold. It has been shown that the confidence score produced by the recognizer depends heavily on the sub-word structure of each keyword. So comparing assigned scores to keyw...
Spotting Subsequences matching a HMM using the Average Observation Probability Criteria with application to Keyword Spotting
This paper addresses the problem of detecting keywords in unconstrained speech. The proposed algorithms search for the speech segment maximizing the average observation probability along the most likely path in the hypothesized keyword model. As known, this approach (sometimes referred to as sliding model method) requires a relaxation of the begin/endpoints of the Viterbi matching, as well as a...
Keyword Spotting Using Normalization of Posterior Probability Confidence Measures
Keyword Spotting Using Normalization of Posterior Probability Confidence Measures by Rachna Vijay Vargiya Thesis Advisor: Marius C. Silaghi, Ph.D. Keyword spotting techniques deal with recognition of predefined vocabulary keywords from a voice stream. This research uses HMM based keyword spotting algorithms for this purpose. The three most important components of a keyword detection system are...
Robust Keyword Spotting Using a Multi-Stream Approach
Speech recognition systems are prone to severe degradation in noisy environments due to mismatch between training and testing conditions. A multi-stream approach for keyword spotting is proposed to improve robustness in mismatched conditions. The assumption is that most real world noises are colored and do not affect the full spectrum equally, meaning certain parts of the spectrum can still pro...
An Efficient Keyword Spotting Techni Language for Filler Mo
The task of keyword spotting is to detect a set of keywords in the input continuous speech. In a keyword spotter, not only the keywords, but also the non-keyword intervals must be modeled. For this purpose, filler (or garbage) models are used. To date, most keyword spotters have been based on hidden Markov models (HMMs). More specifically, a set of HMMs is used as garbage models. In this p...